Analyzing English-Spanish Named-Entity enhanced Machine Translation
نویسندگان
چکیده
Translation of named-entities (NEs) is an issue in SMT. In this paper we analyze the errors when translating NEs with a SMT system from English to Spanish. We train on Europarl and test on News Commentary, focusing on entities correctly recognized by an automatic NE recognition system. The automatic systems translate around 85% NEs correctly, leaving a small margin for improving performance. In addition, we implement a purpose-build NE translator and integrate it in the SMT system, yielding a small but significant improvement in BLEU score. Our analysis shows that, contrary to similar systems translating from Chinese to English, there was no improvement in NE translation, prompting further work.
منابع مشابه
Automatic acquisition of Named Entities for Rule-Based Machine Translation∗
This paper proposes to enrich RBMT dictionaries with Named Entities (NEs) automatically acquired from Wikipedia. The method is applied to the Apertium English–Spanish system and its performance compared to that of Apertium with and without handtagged NEs. The system with automatic NEs outperforms the one without NEs, while results vary when compared to a system with handtagged NEs (results are ...
متن کاملDetection of Non-Native Named Entities Using Prosodic Features for Improved Speech Recognition and Translation
In this work, we describe the use of acoustic-prosodic features to detect and localize non-native named entities spoken by a native speaker in the target language (English) for the purpose of improved speech recognition and translation. The exaggerated variation in accent and duration introduced by the speaker for non-native names is exploited in the detection process through the use of prosodi...
متن کاملبهبود شناسایی موجودیتهای نامدار فارسی با استفاده از کسره اضافه
Named entity recognition is a process in which the people’s names, name of places (cities, countries, seas, etc.) and organizations (public and private companies, international institutions, etc.), date, currency and percentages in a text are identified. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...
متن کاملImproved Named Entity Recognition using Machine Translation-based Cross-lingual Information
In this paper, we describe a technique to improve named entity recognition in a resource-poor language (Hindi) by using cross-lingual information. We use an on-line machine translation system and a separate word alignment phase to find the projection of each Hindi word into the translated English sentence. We estimate the cross-lingual features using an English named entity recognizer and the a...
متن کاملBuilding English-Vietnamese Named Entity Corpus with Aligned Bilingual News Articles
Named entity recognition aims to classify words in a document into pre-defined target entity classes. It is now considered to be fundamental for many natural language processing tasks such as information retrieval, machine translation, information extraction and question answering. This paper presents a workflow to build an English-Vietnamese named entity corpus from an aligned bilingual corpus...
متن کامل